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Abstract 

In low-dimensional topology, many important decision algorithms are based on normal surface 
enumeration, which is a form of vertex enumeration over a high-dimensional and highly degen- 
erate polytope. Because this enumeration is subject to extra combinatorial constraints, the only 
practical algorithms to date have been variants of the classical double description method. In 
this paper we present the first practical normal surface enumeration algorithm that breaks out of 
the double description paradigm. This new algorithm is based on a tree traversal with feasibility 
and domination tests, and it enjoys a number of advantages over the double description method: 
incremental output, significantly lower time and space complexity, and a natural suitability for 
parallelisation. Experimental comparisons of running times are included. 
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1 Introduction 

Much research in low-dimensional topology has been driven by decision problems. Examples from 
knot theory include unknot recognition (given a polygonal representation of a knot, decide whether it 
is equivalent to a trivial unknotted loop), and the more general problem of knot equivalence (given 
two polygonal representations of knots, decide whether they are equivalent to each other). Analogous 
examples from 3-manifold topology include 3-sphere recognition (given a triangulation of a 3-manifold, 
decide whether this 3-manifold is the 3-dimensional sphere), and the more general homeomorphism 
problem (given two triangulations of 3-manifolds, decide whether these 3-manifolds are homcomorphic, 
i.e., "topologically equivalent"). 

Significant progress has been made on these problems over the past 50 years. For instance, Haken 
introduced an algorithm for recognising the unknot in the 1960s |22) . and Rubinstein presented the 
first 3-sphere recognition algorithm in the early 1990s 35 . Since Perelman's recent proof of the 
geometrisation conjecture [29J, we now have a complete (but extremely complex) algorithm for the 
homeomorphism problem, pieced together though a diverse array of techniques by many different 
authors |25ll3"l]. 

A key problem remains with many 3-dimensional decision algorithms (including all of those men- 
tioned above): the best known algorithms run in exponential time (and for some problems, worse), 
where the input size is measured by the number of crossings in a knot diagram, or the number of 
tetrahedra in a 3-manifold triangulation. This severely limits the practicality of such algorithms. For 
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instance, it took around 30 years to resolve Thurston's well-known conjecture regarding the Weber- 
Seifert space [5] — although an algorithmic solution was known in the 1980s [57] , it took another 25 years 
of development before the necessary computations could be performed [14] . Here the input size was 
n = 23. 

The most successful machinery for 3-dimensional decision problems has been normal surface theory. 
Normal surfaces were introduced by Kneser [3U] and later developed by Haken for algorithmic use 
[121 [33], and they play a key role in all of the decision algorithms mentioned above. In essence, they 
allow us to convert difficult "topological searches" into a "combinatorial search" in which we enumerate 
vertices of a high-dimensional polytope called the projective solution spaced In Section [2] we outline 
the combinatorial aspects of the projective solution space; see |24j for a broader introduction to normal 
surface theory in a topological setting. 

For many topological decision problems, the computational bottleneck is precisely the vertex enu- 
meration described above. Vertex enumeration over a polytope is a well-studied problem, and several 
families of algorithms appear in the literature. These include backtracking algorithms [J] 119) , pivot- 
based reverse search algorithms [HE] [18], and the inductive double description method [34] . All have 
worst-case running times that are super-polynomial in both the input and the output size, and it is not 
yet known whether a polynomial time algorithm (with respect to the combined input and output size) 
existsH For topological problems, the corresponding running times are exponential in the "topological 
input size" ; that is, the number of crossings or tetrahedra. 

Normal surface theory poses particular challenges for vertex enumeration: 

• The projective solution space is a highly degenerate polytope (that is, a d-dimcnsional polytope 
in which vertices typically belong to significantly more than d facets). 

• Exact arithmetic is required. This is because each vertex of the projective solution space is a ra- 
tional vector that effectively "encodes" a topological surface, and topological decision algorithms 
require us to scale each vertex to its smallest integer multiple in order to extract the relevant 
topological information. 

• We are not interested in all of the vertices of the projective solution space, but just those 
that satisfy a powerful set of conditions called the guadrilateral constraints. These are extra 
combinatorial constraints that eliminate all but a small fraction of the polytope, and an effective 
enumeration algorithm should be able to enforce them as it goes, not simply use them as a 
post-processing output filter. 

All well-known implementations of normal surface enumeration [5] [TC] are based on the double 
description method: degeneracy and exact arithmetic do not pose any difficulties, and it is simple to 
embed the quadrilateral constraints directly into the algorithm [T2]. Section [3] includes a discussion of 
why backtracking and reverse search algorithms have not been used to date. Nevertheless, the double 
description method does suffer from significant drawbacks: 

• The double description method can suffer from a combinatorial explosion: even if both the input 
and output sizes are small, it can create extremely complex intermediate polytopes where the 
number of vertices is super-polynomial in both the input and output sizes. This in turn leads 

1 For some topological algorithms, vertex enumeration is not enough: instead we must enumerate a Hilbert basis for 
a polyhedral cone. 

2 Efficicnt algorithms do exist for certain classes of polytopes. For example, in the case of non-degenerate polytopes, 
the reverse search algorithm of Avis and Fukuda has virtually no space requirements beyond storing the input, and has 
a running time polynomial in the combined input and output size [3] . 
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to an unwelcome explosion in both time and memory use, a particularly frustrating situation 
for normal surface enumeration where the output size is often small (though still exponential in 
general) [TP] . 

• It is difficult to parallelise the double description method, due to its inductive nature. There 
have been recent efforts in this direction |37], though they rely on a shared memory model, and 
for some polytopes the speed-up factor plateaus when the number of processors grows large. 

• Because of its inductive nature, the double description method typically does not output any 
vertices until the entire algorithm is complete. This effectively removes the possibility of early 
termination (which is desirable for many topological algorithms), and it further impedes any 
attempts at parallelisation. 

In this paper we present a new algorithm for normal surface enumeration. This algorithm belongs 
to the backtracking family, and is described in detail in Section[3] Globally, the algorithm is structured 
as a tree traversal, where the underlying tree is a decision tree indicating which coordinates are zero 
and which are non-zero. Locally, we move through the tree by performing incremental feasibility tests 
using the dual simplex method (though interior point methods could equally well be used) . 

Fukuda et al. [19] discuss an impediment to backtracking algorithms, which is the need to solve the 
NP-complete restricted vertex problem. We circumvent this by ordering the tree traversal in a careful 
way and introducing domination tests over the set of previously-found vertices. This introduces super- 
polynomial time and space trade-offs, but for normal surface enumeration these trade-offs are typically 
not severe (we quantify this both theoretically and empirically within Sections [5] and [6] respectively) . 

Our algorithm is well-suited to normal surface enumeration. It integrates the quadrilateral con- 
straints seamlessly into the tree structure. Moreover, we are often able to perform exact arithmetic 
using native machine integer types (instead of arbitrary precision integers, which are significantly 
slower). We do this by exploiting the sparseness of the equations that define the projective solution 
space, and thereby bounding the integers that appear in intermediate calculations. The details appear 
in Section |H 

Most significantly, this new algorithm is the first normal surface enumeration algorithm to break 
out of the double description paradigm. Furthermore, it enjoys a number of advantages over previous 
algorithms: 

• The theoretical worst-case bounds on both time and space complexity are significantly better for 
the new algorithm. In particular, the space complexity is a small polynomial in the combined 
input and output size. See Section [5] for details. 

• The new algorithm lends itself well to parallelisation, including both shared memory models and 
distributed processing, with minimal need for inter-communication and synchronisation. 

• The new algorithm produces incremental output, which makes it well-suited for early termination 
and further parallelisation. 

• The tree traversal supports a natural sense of "progress tracking" , so that users can gain a rough 
sense for how far they are through the enumeration procedure. 

In Section |BJ we measure the performance of the new algorithm against the prior state of the 
art, using a rich test suite with hundreds of problems that span a wide range of difficulties. For 
simple problems we find the new algorithm slower, but as the problems become more difficult the new 
algorithm quickly outperforms the old, often running orders of magnitude faster. We conclude that 
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for difficult problems this new tree traversal algorithm is superior, owing to both its stronger practical 
performance and its desirable theoretical attributes as outlined above. 

2 Preliminaries 

For decision problems based on normal surface theory, the input is typically a 3-manifold triangulation. 
This is a collection of n tetrahedra where some or all of the An faces are affinely identified (or "glued 
together") in pairs, so that the resulting topological space is a 3-manifold (possibly with boundary). 
The size of the input is measured by the number of tetrahedra, and we refer to this quantity as n 
throughout this paper. 

We allow two faces of the same tetrahedron to be identified together; likewise, we allow different 
edges or vertices of the same tetrahedron to be identified. Some authors refer to such triangulations 
as generalised triangulations. In essence, we allow the tetrahedra to be "bent" or "twisted" , instead of 
insisting that they be rigidly embedded in some larger space. This more flexible definition allows us 
to keep the input size n small, which is important when dealing with exponential algorithms. 




Figure 1: A 3-manifold triangulation of size n = 2 

Figure [T] illustrates a triangulation with n = 2 tetrahedra: the back two faces of each tetrahedron 
are identified with a twist, and the front two faces of the first tetrahedron are identified with the 
front two faces of the second tetrahedron. As a consequence, all eight tetrahedron vertices become 
identified together, and the edges become identified into three equivalence classes as indicated by the 
three different arrowheads. We say that the overall triangulation has one vertex and three edges. The 
underlying 3-manifold represented by this triangulation is the product space S 2 x S . 

A closed triangulation is one in which all An tetrahedron faces are identified in 2n pairs (as in 
the example above). A bounded triangulation is one in which some of the An tetrahedron faces are 
left unidentified; together these unidentified faces form the boundary of the triangulation. The un- 
derlying topological spaces for closed and bounded triangulations are closed and bounded 3-manifolds 
respectively. 

There are of course other ways of representing a 3-manifold (for instance, Heegaard splittings or 
Dehn surgeries). We use triangulations because they are "universal": it is typically easy to convert 
some other representation into a triangulation, whereas it can sometimes be difficult to move in the 
other direction. In particular, when we are solving problems in knot theory, we typically work with a 
triangulation of the knot complement — that is, the 3-dimensional space surrounding the knot. 

In normal surface theory, interesting surfaces within a 3-manifold triangulation can be encoded as 
integer points in M. 3n . Each such point x g ]R 3 ™ satisfies the following conditions: 

• x > and Mx = 0, where M is a matrix of matching equations that depends on the input 
triangulation; 
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• x satisfies the quadrilateral constraints: for each tripie of coordinates (x\, X2, £3), (#4, x$, Xq), 
. . . , (x3„__2, ^3n-i, x?,n), at most one of the three entries in the triple can be non-zero. There are 
n such triples, giving n such restrictions in total. 

Any point x G R 3n that satisfies all of these conditions (that is, x > 0, Mx = and the quadrilateral 
constraints) is called admissible. 

It should be noted that normal surface theory is often described using a different coordinate sys- 
tem (R 7 ™, also known as standard coordinates). Our coordinates in R 3n are known as quadrilateral 
coordinates, and were introduced by Tollefson [39] . They are ideal for computation because they are 
smaller and faster, and because information that is "lost" from standard coordinates is typically quick 
to reconstruct [9]. 

These coordinates have a natural geometric interpretation, which we outline briefly. Typically 
one can arrange for an interesting surface within the 3-manifold to intersect each tetrahedron of the 
triangulation in a collection of disjoint triangles and / or quadrilaterals, as illustrated in Figure [5J 
The corresponding admissible integer point in R 3 ™ counts how many quadrilaterals slice through each 
tetrahedron in each of three possible directions. From this information, the locations of the triangles 
can also be reconstructed, under some weak assumptions about the surface. 



Figure 2: Triangles and quadrilaterals in a tetrahedron 

The matching equations ensure that triangles and quadrilaterals in adjacent tetrahedra can be 
joined together, and the quadrilateral constraints ensure that the resulting surface does not intersect 
itself. We do not use this geometric interpretation in this paper, and we do not discuss it further; for 
details the reader is referred to |24j . 

We recount some well-known facts about the matching equations [9j [33 [39] : 

Lemma 2.1. Let e and v be the number of edges and vertices of the triangulation that do not lie within 
the boundary. Then there are e matching equations; that is, M is an e x 3n matrix. In the case of a 
closed triangulation, we have e = n + V > n. 

In general one or more matching equations may be redundant. In the case of a closed triangulation, 
the rank of M is precisely n. 

Lemma 2.2. The matrix M is sparse with small integer entries. Each column contains at most four 
non-zero entries, each of which is ±1, ±2, ±3 or ±4. Moreover, the sum of absolute values in each 
column is at most 4; that is, |Mi,c| < 4 for all c. If the triangulation is orientable, the only possible 
non-zero entries are ±1 and ±2. 

We define the solution cone J2 V to be the set 

J v = {xe R 3 " I x > and Mx = 0} . 

This is a polyhedral cone with apex at the origin. From above, it follows that interesting surfaces 
within our 3-manifold are encoded as integer points within the solution cone. The projective solution 
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space J2 is defined to be a cross-section of this cone: 

£= {xe j2 v |^xi = l}. 

It follows that the projective solution space is a (bounded) polytope with rational vertices. In general, 
both the solution cone and the projective solution space are highly degenerate (that is, extreme rays 
of =2 V and vertices of =2 often belong to many more facets than required by the dimension). 

For polytopes and polyhedra, we follow the terminology of Ziegler [40]: polytopes are always 
bounded, polyhedra may be either bounded or unbounded, and all polytopes and polyhedra are as- 
sumed to be convex. 

It is important to note that the quadrilateral constraints do not feature in the definitions of =2 V 
or =2. These constraints are combinatorial in nature, and they cause significant theoretical problems: 
they are neither linear nor convex, and the region of =2 that obeys them can be disconnected or can 
have non-trivial topology (such as "holes"). Nevertheless, they are extremely powerful: although the 
vertices of =2 can be numerous, typically just a tiny fraction of these vertices satisfy the quadrilateral 
constraints [TU1 H3] ■ 

For many topological decision problems that use normal surface theory, a typical algorithm runs 
as follows: 

1. Enumerate all admissible vertices of the projective solution space =2 (that is, all vertices of =2 
that satisfy the quadrilateral constraints); 

2. Scale each vertex to its smallest integer multiple, "decode" it to obtain a surface within the 
underlying 3-manifold triangulation, and test whether this surface is "interesting" ; 

3. Terminate the algorithm once an interesting surface is found, or once all vertices are exhausted. 

The definition of "interesting" varies for different decision problems. For more of the overall topological 
context see |24j . and for more details on quadrilateral coordinates and the projective solution space 
see 0. 

As indicated in the introduction, step 1 (the enumeration of vertices) is typically the computational 
bottleneck. Because so few vertices are admissible, it is crucial for normal surface enumeration algo- 
rithms to enforce the quadrilateral constraints as they go — one cannot afford to enumerate the entire 
vertex set of *2 (which is generally orders of magnitude larger) and then enforce the quadrilateral 
constraints afterwards. 

Traditionally, normal surface enumeration algorithms are based on the double description method 
of Motzkin et al. |34j , with additional algorithmic improvements specific to normal surface theory |12j . 
In brief, we construct a sequence of polytopes Pq, Pi, . . . , P e , where Po is the unit simplex in R 3 ™, P e 
is the final solution space =2, and each intermediate polytope Pi is the intersection of the unit simplex 
with the first i matching equations. The algorithm works inductively, constructing P^+i from Pj at 
each stage. For each intermediate polytope Pi we throw away any vertices that do not satisfy the 
quadrilateral constraints, and it can be shown inductively that this yields the correct final solution set. 

It is worth noting why backtracking and reverse search algorithms have not been used for normal 
surface enumeration to date: 

• Reverse search algorithms map out vertices by pivoting along edges of the polytope. This makes 
it difficult to incorporate the quadrilateral constraints: since the region that satisfies these con- 
straints can be disconnected, we may need to pivot through vertices that break these constraints 
in order to map out all solutions. Degeneracy can also cause significant performance problems 
for reverse search algorithms [TJ [2] . 
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• Backtracking algorithms receive comparatively little attention in the literature. Fukuda et al. [19] 
show that backtracking can solve the face enumeration problem in polynomial time. However, 
for vertex enumeration their framework requires a solution to the NP-complete restricted vertex 
problem, and they conclude that straightforward backtracking is unlikely to work efficiently for 
vertex enumeration in general. 

Our new algorithm belongs to the backtracking family. Despite the concerns of Fukuda et al., we 
find in Section [5] that for large problems our algorithm is a significant improvement over the prior state 
of the art, often running orders of magnitude faster. We achieve this by introducing domination tests, 
which are time and space trade-offs that allow us to avoid solving the restricted vertex problem directly. 
The full algorithm is given in Section [3] below, and in Sections [5] and [6] we study these trade-offs both 
theoretically and experimentally. 

3 The tree traversal algorithm 

The basic idea behind the algorithm is to construct admissible vertices x 6 J3 by iterating through 
all possible combinations of which coordinates Xi are zero and which coordinates Xi are non-zero. We 
arrange these combinations into a search tree of height n, which we traverse in a depth-first manner. 
Using a combination of feasibility tests and domination tests, we are able to prune this search tree 
so that the traversal is more efficient, and so that the leaves at depth n correspond precisely to the 
admissible vertices of =S. 

Because the quadrilateral constraints are expressed purely in terms of zero versus non-zero coordi- 
nates, we can easily build the quadrilateral constraints directly into the structure of the search tree. 
We do this with the help of type vectors, which we introduce in Section [3. f I In Section [321 we describe 
the search tree and present the overall structure of the algorithm, and we follow in Sections 13.31 and !3. 41 
with details on some of the more complex steps. 

3.1 Type vectors 

Recall the quadrilateral constraints, which state that for each i = I, . . . ,n, at most one of the three 
coordinates 2:3,-2, £31-1, x 3 i can be non-zero. A type vector is a sequence of n symbols that indicates 
how these constraints are resolved for each i. In detail: 

Definition 3.1 (Type vector). Let x = (x\, . . . , x 3n ) £ M. 3n be any vector that satisfies the quadrilat- 
eral constraints. We define the type vector of x to be r(x) = (n, . . . , r„) <E {0, 1, 2, 3}™, where 



For example, consider the vector x = (0, 1,0, 0,0,4, 0,0,0, 0,0,2) £ M 12 where n = 4. Here the 
type vector of x is r(x) = (2, 3, 0, 3). 

An important feature of type vectors is that they carry enough information to completely recon- 
struct any vertex of the projective solution space, as shown by the following result. 

Lemma 3.2. Let x be any admissible vertex of J3. Then the precise coordinates x%, . . . ,x 3n can be 
recovered from the type vector r(x) by solving the following simultaneous equations: 





1 

2 
3 



if x 3i - 2 = x 3i -i = x 3i = 0; 
if x 3i - 2 ^ and x 3i _i = x 3i 
if x 3i -i ^ and x 3i - 2 = x 3i 
if x 3i ^ and x 3i - 2 = x 3l -i 



0; 
0: 
0. 



(3.1) 
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• the matching equations Mx = 0; 

• the projective equation ^ Xi = 1; 

• the equations of the form Xj — as dictated by \3. 1\) above, according to the particular value ( 0, 
1, 2 or 3) of each entry in the type vector r(x). 

Proof. This result simply translates Lemma 4.4 of |12j into the language of type vectors. However, it 
is also a simple consequence of polytope theory: the type vector indicates which facets of =2 the point 
x belongs to, and any vertex of a polytope can be reconstructed as the intersection of those facets to 
which it belongs. See [12] for further details. □ 

Note that this full recovery of x from t(x) is only possible for vertices of =2, not arbitrary points of 
J2. In general, the type vector r(x) carries only enough information for us to determine the smallest 
face of J? to which x belongs. 

However, type vectors do carry enough information for us to identify which admissible points x G =2 
are vertices of i?. For this we introduce the concept of domination. 

Definition 3.3 (Domination). Let r, a G {0, 1,2,3}™ be type vectors. We say that r dominates a if, 
for each i — 1, . . . , n, either Oi — Tj or o~i — 0. We write this as r > a. 

For example, if n = 4 then (1,0,2,3) > (1,0,2,0) > (1,0,2,0) > (1,0,0,0). On the other hand, 
neither of (1, 0, 2, 0) or (1, 0, 3, 0) dominates the other. It is clear that in general, domination imposes 
a partial order (but not a total order) on type vectors. 

Remark. We use the word "domination" because domination of type vectors corresponds precisely 
to domination in the face lattice of the polytope £}. For any two admissible points x, y G R 3 ™, we see 
from Definition ^ . 1 1 that r(x) > r(y) if and only if, for every coordinate position i where Xi = 0, we also 
have yi — 0. Phrased in the language of polytopes, this means that r(x) > r(y) if and only if every 
face of i? containing x also contains y. That is, r(x) > r(y) if and only if the smallest-dimensional 
face containing y is a subface of the smallest-dimensional face containing x. 

These properties allow us to characterise admissible vertices of the projective solution space entirely 
in terms of type vectors, as seen in the following result. 

Lemma 3.4. Let x G be admissible. Then the following statements are equivalent: 

• x is a vertex of J2; 

• there is no admissible point y G =2 for which t(x) > r(y) and x ^ y; 

• there is no admissible vertex y o/J for which t(x) > r(y) and x ^ y. 

Proof. This is essentially Lemma 4.3 of [9], whose proof involves elementary polytope theory with some 
minor complications due to the quadrilateral constraints. Here we merely reformulate this result in 
terms of type vectors. See [5] for further details. □ 

Because our algorithm involves backtracking, it spends much of its time working with partially- 
constructed type vectors. It is therefore useful to formalise this notion. 
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Definition 3.5 (Partial type vector). A partial type vector is any vector r = (ti, . . . , r„), where each 
Ti € {0, 1,2,3,—}. We call the special symbol "— " the unknown symbol. A partial type vector that 
does not contain any unknown symbols is also referred to as a complete type vector. 

We say that two partial type vectors r and a match if, for each i — 1, . . . , n, either Ti = er,; or at 
least one of Ti, <7j is the unknown symbol. 

For example, (1, — , 2) and (1, 3, 2) match, and (1, — , 2) and (1, 3, — ) also match. However, (1, — , 2) 
and (0, — ,2) do not match because their leading entries conflict. If r and a are both complete type 
vectors, then r matches a if and only if r = cr. 

Definition 3.6 (Type constraints). Let r = (ri,...,r n ) G {0,1,2,3,—}™ be a partial type vector. 
The type constraints for r are a collection of equality constraints and non-strict inequality constraints 
on an arbitrary point x = {x\, . . . , x 3n ) £ R 3 ™. For each i = 1, . . . , n, the type symbol Tj contributes 
the following constraints to this collection: 



Note that, unlike all of the other constraints seen so far, the type constraints are not invariant 
under scaling. The type constraints are most useful in the solution cone =S V , where any admissible 
point x £ i? v with Xj > can be rescaled to some admissible point Ax € =S V with Xxj > 1. We see 
this scaling behaviour in Lemma T3.7I below. 

The type constraints are similar but not identical to the conditions described by equation (|3.1[) 
for the type vector r(x), and their precise relationship is described by the following lemma. The 
most important difference is that (|3.1[) uses strict inequalities of the form Xj > 0, whereas the type 
constraints use non-strict inequalities of the form Xj > 1. This is because the type constraints will be 
used with techniques from linear programming, where non-strict inequalities are simpler to deal with. 

Lemma 3.7. Let a £ {0, 1, 2, 3, — }" be any partial type vector. J/x £ R 3 ™ satisfies the type constraints 
for a, then the type vector r(x) matches a. Conversely, i/x £ R 3n is an admissible vector whose type 
vector t(x) matches a , then Ax satisfies the type constraints for a for some scalar A > 0. 

Proof. This is a simple exercise in matching Definitions 13.11 and 13.61 Recall that admissible vectors x 
must satisfy x > 0, which is why we are able to convert Xj ^ into Xxj > 1. □ 

We finish this section on type vectors with a simple but important note. 

Lemma 3.8. In the projective solution space, type vectors are always non-zero. That is, there is no 
x 6 £2 for which r(x) = 0. 

Proof. If t(x) = then x = 0, by Definition 13. II However, every point in the projective solution space 
lies in the hyperplane ^ Xi = 1. □ 

3.2 The structure of the algorithm 

Recall that our overall plan is to iterate through all possible combinations of zero and non-zero co- 
ordinates. We do this by iterating through all possible type vectors, thereby implicitly enforcing the 
quadrilateral constraints as we go. The framework for this iteration is the type tree, which we define 
as follows. 



X3t-2 > 1 and a?3j_i = x 3i — 
xm-i > 1 and x 3i _ 2 = x 3i — 
X31 > 1 and x 3i - 2 = £3i-i = 
no constraints 



X3i-2 = X3i-l = %3i = 



if Ti 
if Ti 
if Ti 

if ^ 
if ^ 




1 

2 
3 
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Definition 3.9 (Type tree). The type tree is a rooted tree of height n, where all leaf nodes are at 
depth n, and where each non-leaf node has precisely four children. The four edges descending from 
each non-leaf node are labelled 0, 1, 2 and 3 respectively. 

Each node is labelled with a partial type vector. The root node is labelled (—,...,—). Each 
non-leaf node at depth i has a label of the form (n, . . . , Tj, — ), and its children along edges 

0, 1, 2 and 3 have labels (tj., ... ,n, 0, -), (n, n, 1, -), (n, ... ,t u 2, -) and 

(ti, ... ,Tj, 3, — ) respectively. Figure |3] illustrates the type tree of height n = 2. 



(-,-) 




(0,-) (1,-) (2,-) (3,-) 




(0,0) (0,1) (0,2) (0,3) (1,0) (1,1) (1,2) (1,3) (2,0) (2,1) (2,2) (2,3) (3,0) (3,1) (3,2) (3,3) 



Figure 3: The type tree of height n — 2 

The algorithm walks through the type tree in a depth-first manner, collecting admissible vertices of 
=2 as it goes. The tree is large however, with 0(4") nodes, and so we do not traverse the type tree in 
full. Instead we prune the tree so that we only follow edges that might lead to admissible vertices. The 
main tools that we use for pruning are feasibility tests (based on the type constraints) and domination 
tests (based on the previous vertices found so far). The details are as follows. 

Algorithm 3.10 (Tree traversal algorithm) . The following algorithm takes a 3-manifold triangulation 
as input, and outputs the set of all admissible vertices of the projective solution space i?. 

Construct the matching equations M ( see J§j] for details ), and delete rows until M has full rank. 
Initialise an empty set V; this will be used to store the type vectors for admissible vertices of JS. 

Beginning at the root of the type tree, process nodes recursively according to the following instruc- 
tions. In general, we process the node N as follows: 

• If N is a non-leaf node then examine the four children of N in turn, beginning with the child 
along edge (the order of the remaining children is unimportant). For a child node labelled with 
the partial type vector t, recursively process this child if and only if all of the following conditions 
are satisfied: 

(a) Zero test: r is not the zero vector. 

(b) Domination test: If we replace every unknown symbol — with 0, then t does not dominate 
any type vector in V . 

(c) Feasibility test: There is some point x £ R 3n that satisfies x > 0, Mx = 0, and the type 
constraints for r. 

• If N is a leaf node then its label is a complete type vector t, and we claim that t must in fact 
be the type vector of an admissible vertex of £? (we prove this in Theorem \3.11\) . Insert r into 
the set V , reconstruct the full vertex x £ K 3n from r as described by Lemma \3.2[ and output this 
vertex. 
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Both the domination test and the feasibility test are non-trivial. We discuss the details of these 
tests in Sections [3.31 and below, along with the data structure used to store the solution set V. On 
the other hand, the zero test is simple; moreover, it is easy to see that it only needs to be tested once 
(for the very first leaf node that we process). 

From the viewpoint of normal surface theory, it is important that this algorithm be implemented 
using exact arithmetic. This is because each admissible vertex x £ J2 needs to be scaled to its smallest 
integer multiple Ax € Z 3 ™ in order to extract useful information for topological applications. 

This is problematic because both the numerators and the denominators of the rationals that we 
encounter can grow exponentially large in n. For a naive implementation we could simply use arbitrary- 
precision rational arithmetic (as provided for instance by the GMP library [21]). However, the bounds 
that we prove in Section [4] of this paper allow us to use native machine integer types (such as 32-bit, 
64-bit or 128-bit integers) for many reasonable-sized applications. 

Theorem 3.11. Algorithm \3.1U\ is correct. That is, the output is precisely the set of all admissible 
vertices of J3. 

Proof. By Lemma \3. 21 the algorithm is correct if, when it terminates, the set V is precisely the set of 
type vectors of all admissible vertices of Si. We prove this latter claim in three stages. 

1. Every type vector in V is the type vector of some admissible point x£ J. 

Let a be some type vector in V . Because we reached the corresponding leaf node in our tree 
traversal, we know that a passes the feasibility test. That is, there is some x £ R 3n that satisfies 
x > 0, Mx = 0, and the type constraints for a. Moreover, a also passes the zero test, which 
means we can rescale x so that x i = 1 • 

Because a was obtained from a leaf node it must be a complete type vector, which means that 
the type constraints for a also enforce the quadrilateral constraints. In other words, this point 
x must be an admissible point in Si. 

From Lemma 13.71 we know that t(x) matches cr, and because a is complete it follows that 
t(x) = a . That is, a is the type vector of an admissible point in Si. 

2. For every admissible vertex x of Si, the type vector r(x) appears in V. 

All we need to show here is that we can descend from the root of the type tree to the leaf node 
labelled r(x) without being stopped by the zero test, the domination test or the feasibility test. 

Let N be any ancestor of the leaf labelled r(x) (possibly including the leaf itself), and let o be 
its label. This means that a can be obtained from r(x) by replacing zero or more entries with 
the unknown symbol. In particular, a matches t(x). 

We know that a passes the zero test by Lemma T3.8I From Lemma T3.7I there is some A > for 
which Ax satisfies the type constraints for a. Since x £ S it is clear also that Ax > and 
M Ax = 0, and so a passes the feasibility test. 

By Lemma l3~4l the type vector t(x) does not dominate r(y) for any admissible point y £ Si, and 
from stage Q] above it follows that r(x) does not dominate any type vector in V. The same is 
still true if we replace some elements of t(x) with (since this operation cannot introduce new 
dominations), which means that a passes the domination test. 

3. Every type vector in V is the type vector of some admissible vertex x G i2. 
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Suppose that some a E V is not the type vector of an admissible vertex. By stage [T] above we 
know there is some admissible point x £ =S for which a = t(x). Because x is not a vertex, it 
follows from Lemma T3.4I that a > r(y) for some admissible vertex y e JS. 

By stage above we know that r(y) also appears in V. Since a > r(y), the type vector r(y) 
can be obtained from a by replacing one or more entries with 0. This means that the algorithm 
processes the leaf labelled r(y) before the leaf labelled a, yielding a contradiction: the node 
labelled a should never have been processed, since it must have failed the domination test. 

Steps and [3] together show that V is precisely the set of type vectors of all admissible vertices of 
J2, and the algorithm is therefore proven correct. □ 

Remark. All three tests (zero, domination and feasibility) can be performed in polynomial time in 
the input and/or output size. What prevents the entire algorithm from running in polynomial time is 
dead ends: nodes that we process but which do not eventually lead to any admissible vertices. 

It is worth noting that Fukuda et al. [19] describe a backtracking framework that does not suffer 
from dead ends. However, this comes at a significant cost: before processing each node they must solve 
the restricted vertex problem, which essentially asks whether a vertex exists beneath a given node in 
the search tree. They prove this restricted vertex problem to be NP-complete, and they do not give 
an explicit algorithm to solve it. 

We discuss time and space complexity further in Section[5l where we find that — despite the presence 
of dead ends — this tree traversal algorithm enjoys significantly lower worst-case complexity bounds 
than the prior state of the art. This is supported with experimental evidence in Section [6] In the 
meantime, it is worth noting some additional advantages of the tree traversal algorithm: 

• The tree traversal algorithm lends itself well to parallelisation. 

The key observation here is that, for each non-leaf node N, the children along edges 1, 2 and 3 can 
be processed simultaneously (though edge still needs to be processed first). There is no need 
for these three processes to communicate, since they cannot affect each other's domination tests. 
This three-way branching can be carried out repeatedly as we move deeper through the type tree, 
allowing us to make effective use of a large number of processors if they are available. Distributed 
processing is also practical, since at each branch the data transfer has small polynomial size. 

• The tree traversal algorithm supports incremental output. 

If the type vectors of admissible vertices are reasonably well distributed across the type tree (as 
opposed to tightly clustered in a few sections of the tree), then we can expect a regular stream of 
output as the algorithm runs. From practical experience, this is indeed what we see: for problems 
that run for hours or even days, the algorithm outputs solutions at a continual (though slowing) 
pace, right from the beginning of the algorithm through until its termination. 

This incremental output is important for two reasons: 

— It allows early termination of the algorithm. For many topological decision algorithms, our 
task is to find any admissible vertex with some property P (or else conclude that no such 
vertex exists). As the tree traversal algorithm outputs vertices we can test them immediately 
for property P, and as soon as such a vertex is found we can terminate the algorithm. 

— It further assists with parallelisation for problems in which testing for property P is expen- 
sive (this is above and beyond the parallelisation techniques described above). An example 
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is the Hakenness testing problem, where the test for property P can be far more difficult 
than the original normal surface enumeration [2]. For problems such as these, we can 
have a main tree traversal process that outputs vertices into a queue for processing, and 
separate dedicated processes that simultaneously work through this queue and test vertices 
for property P. 

This is a significant improvement over the double description method (the prior state of the art 
for normal surface enumeration) . As outlined in Section [2] the double description method works 
inductively through a series of stages, and it typically does not output any solutions at all until 
it reaches the final stage. 

• The tree traversal algorithm supports progress tracking. 

For any given node in the type tree, it is simple to estimate when it appears (as a percentage of 
running time) in a depth-first traversal of the full tree. We can present these estimates to the 
user as the tree traversal algorithm runs, in order to give them some indication of the running 
time remaining. 

Of course such estimates are inaccurate: they ignore pruning (which allows us to avoid significant 
portions of the type tree) , and they ignore the variability in running times for the feasibility and 
domination tests (for instance, the domination test may slow down as the algorithm progresses 
and the solution set V grows). Nevertheless, they provide a rough indication of how far through 
the algorithm we are at any given time. 

Again this is a significant improvement over the double description method. Some of the stages 
in the double description method can run many orders of magnitude slower than others [3J [5] , 
and it is difficult to estimate in advance how slow each stage will be. In this sense, the obvious 
progress indicators (i.e., which stage we are up to and how far through it we are) are extremely 
poor indicators of global progress through the double description algorithm. 



- O (3.303™) as 
V\ < 4" (the total 



3.3 Implementing the domination test 

What makes the domination test difficult is that the solution set V can grow exponentially large jTD] . 

For closed triangulations, the best known theoretical bound is \V\ € O ^ 3+ ^^ 

proven in [13] . and for bounded triangulations there is only the trivial bound 
number of leaves in the type tree) . 

The central operation in the domination test is to decide, given a complet^f) type vector r, whether 
there exists some a € V for which r > a. A naive implementation simply walks through the entire set 
V, giving a time complexity of 0(n|y|) (since each individual test of r > a runs in 0(n) time). 

However, we can do better with an efficient data structure for V. A key observation is that, 
although we might have up to 4™ type vectors in V, there are at most 2™ candidate type vectors a 
that could possibly be dominated by r. More precisely, suppose that r contains k non-zero entries and 
n — k zero entries (where < k < n) . Then the only possible type vectors that r could dominate are 
those obtained by replacing some of the non-zero entries in r with zeroes, yielding 2 fc such candidates 
in total. 

We could therefore store V using a data structure that supports fast insertion and fast lookup 
(such as a red-black tree), and implement the domination test by enumerating all 2 k candidate vectors 



3 Recall that for the domination test we replace every unknown symbol — with 0. 
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a and searching for each of them in V. This has a time complexity of 0(n2 k log | V\), which can be 
simplified to 0{n 2 2 k ). 

Although this looks better than the naive implementation in theory, it can be significantly worse 
in practice. This is because, although \V\ has a theoretical upper bound of either 0(3.303") or 4™, 
in practice it is far smaller. Detailed experimentation on closed triangulations [TO] suggests that on 
average |V| grows at a rate below 1.62™, which means that enumerating all candidate vectors a can 
be far worse than simply testing every a £ V . 

A compromise is possible that gives the best of both worlds. We can represent V using a tree 
structure that mirrors the type tree but stores only those nodes with descendants in V. This is 
essentially a trie (i.e., a prefix tree), and is illustrated in Figure |4j 



(-,-.-) 




(0,-,-) (1,--) (3,-,-) 




(0,0,-) (0,2,-) (1,1,-) (3,0,-) 




(0,0,1) (0,2,3) (1,1,0) (1,1,3) (3,0,2) 

Figure 4: A trie representing V = {(0,0,1), (0,2,3), (1,1,0), (1,1,3), (3,0,2)} 

To implement the domination test we simply walk through the portion of the trie that corresponds 
to nodes dominated by r. This walk spans at most 2 k leaf nodes plus at most n2 k ancestors. However, 
we also know that the trie contains at most n\V\ nodes in total (ancestors included). The walk therefore 
covers 0(min{n|V|, n2 k }) nodes, and since each comparison r > a comes for free with the structure 
of the trie, we have an overall running time of (^(minlniyi, n2 k }) for the domination test. 

3.4 Implementing the feasibility test 

Recall that the feasibility test asks whether any point x <E K 3n satisfies x > 0, Mx = 0, and the type 
constraints for a given partial type vector r. Tests of this kind are standard in the linear programming 
literature, and can be performed in polynomial time using interior point methods |17l 128] . 

However, our implementation of the feasibility test is based not on interior point methods, but on 
the dual simplex method. This is a pivot-based algorithm, and although it has a worst-case exponential 
running time in theory, it is simple to code and performs extremely well in practice 33 36,.. Moreover, 
the dual simplex method makes it easy to add new constraints to a previously-solved problem, allowing 
us to incrementally update our solutions x £ R 3 ™ as we walk through the type tree. For details on the 
dual simplex method, the reader is referred to a standard text on linear programming such as [S]- 

Here we outline the framework of the dual simplex method as it applies to the feasibility test. We 
explicitly describe the various matrices and vectors that appear, since these matrices and vectors play 
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a key role in the results of Section 2J 

Our first step is to simplify the integer matrix of matching equations M: 

Definition 3.12 (Reduced matching equations). Let M be the integer matrix of matching equations, 
as described in Section [5] The reduced matching equation matrix M is obtained from M by dividing 
each non-zero column by its greatest common divisor, which is always taken to be positive. 

An example matrix M and the corresponding reduced matrix M are illustrated below. The first 
and fourth columns are divided by 2, and the remaining columns are left unchanged. 

1-1 1 -1 -1 " 
-10 2-102 

It is clear from Lemma [2~2l that each greatest common divisor is 1, 2, 3 or 4, and in the case of an 
orientable triangulation just 1 or 2. Although this reduction is small, it turns out to have a significant 
impact on the arithmetical bounds that we prove in Section @] 

Although a solution to Mx = need not satisfy Mx = (and vice versa), the two matrices are 
interchangeable for our purposes as shown by the following simple lemma. 

Lemma 3.13. Replacing M with M does not affect the results of the feasibility test. In other words, 
for any partial type vector t the following two statements are equivalent: 

• There exists an x G R 3 ™ satisfying x > ; Afx = and the type constraints for t; 

• There exists an x' € M 3 ™ satisfying x' > 0, Mx' = and the type constraints for r. 

Proof. For each i — 1, . . . , 3n, let di denote the greatest common divisor of the ith column of M (or 1 
if the ith column of M is zero). In addition, let D = lcm(<ii, . . . , d^n). 

Suppose that x = (x%, . . . , ai3 n ) satisfies x > 0, Mx = and the type constraints for r. Then it is 
clear that x' = (dix±, . . . , d^nX^n) satisfies x' > 0, Mx' = and the type constraints for r. 

Conversely, suppose that x' = (x' 1: . . . ,x' 3n ) satisfies x' > 0, Afx' = and the type constraints 
for r. In this case, we find that x = (Dx[/di, . . . , Dx' Zn / d^n) satisfies x > 0, Mx = and the type 
constraints for r. □ 

Our next step is to replace type constraints of the form Xi > 1 with equivalent constraints of the 
form x[ > 0. 

Lemma 3.14. Let r be any partial type vector, and let the type constraints for r be Xi — for all 
i £ I and Xj > 1 for all j G J. Then the following two statements are equivalent: 

• There exists an x G M 3n for which x > 0, Mx = 0, Xi = for all i E I and Xj > 1 for all j G J ; 

• There exists an x' G M 3 " for which x' > 0, Afx' = b and x\ — for all i G I , where b = 
— Xl^ej Mj, and where Mj refers to the jth column of M . 

Proof. The two statements are equivalent under the substitution x'a = Xj — 1 for j £ J and Xj = Xj 
for j <£ J. □ 

Based on Lemmata 13.131 and l3.14[ we structure the feasibility test as follows. 



M 



1-1 2-1-1 
-2 2 -2 2 



M 



15 



Algorithm 3.15 (Framework for feasibility test). To perform the feasibility test, we use the dual 
simplex method to search for a solution x £ K 3n for which x > 0, Mx = b and x% = for all i G /. 
Here the vector b and the index set I reflect the partial type vector t that we are working with, and we 
construct both b and I incrementally as follows: 

• When we begin at the root node with t = (—,...,—), we set b = and 7 = 0. 

• Suppose we step down from a parent node labelled (n, . . . , r^, —,—,...,—) to a child node labelled 
(n, • • • , Tk,Tk+i, -, ...,-)• 

— If Tk+\ = then we insert the three new indices 3k + 1, 3fc + 2 and 3fc + 3 into the set I, 
reflecting the additional type constraints X3k+i = ^3fc+2 = X3k+3 = 0. 

— If Tfc + i = 1 then we insert the two new indices 3k + 2 and 3k + 3 into I and subtract the 
column M^k+i from b, reflecting the additional type constraints £3/0+2 = £3^+3 = and 
Xzk+l > 1. 

— The cases Tk+i = 2 and 3 are handled in a similar fashion to Tk+\ = 1 above. 

It is important to note that this "step down" operation involves only minor changes to b and /. 
This means that, instead of solving each child feasibility problem from scratch, we can use the solution 
x from the parent node as a starting point for the new dual simplex search when we introduce our 
new constraints. We return to this shortly. 

Recall from Algorithm 13.101 that M is full rank, and suppose that rank M — rankM = k. To 
perform the feasibility test we maintain the following structures (all of which are common to revised 
simplex algorithms, and again we refer the reader to a standard text [5] for details): 

• a basis (3 = . . . , /3k} C {1,..., 3n}, which identifies k linearly independent columns of M; 

• the k x k inverse matrix Mp 1 , where Mp denotes the k x k matrix formed from columns f3i, . . . , j3k 
of M. 

It is useful to think of M^ 1 as a matrix of row operations that we can apply to both M and b. 

The product M^" 1 M includes the k x k identity as a submatrix, and the product M^h identifies the 
elements of a solution vector x as follows. 

Any basis /? describes a solution x 6 R 3n to the equation Mx = b, in which xj = for each 
j ^ /3, and where the product M^ l h lists the remaining elements {x$ x , . . . , x@ k ). Such a solution only 
satisfies x > if we have M^b > 0. We call the basis feasible if M^" 1 b > 0, and we call it infeasible 
if M^b ^ 0. 

The aim of the dual simplex method is to find a feasible basis. It does this by constructing some 
initial (possibly infeasible) basis0 and then repeatedly modifying this basis using pivots. Each pivot 
replaces some index in /3 with some other index not in (3, and corresponds to a simple sequence of 
row operations that we apply to M^ 1 . Eventually the dual simplex method produces either a feasible 
basis or a certificate that no feasible basis exists. We apply this procedure to our problem as follows: 

Algorithm 3.16 (Details of feasibility test). At the root node of the type tree, we simply construct 
any initial basis and then pivot until we obtain either a feasible basis or a certificate that no feasible 
basis exists (in which case the root node fails the feasibility test). 

4 The dual simplex method usually has an extra requirement that this initial basis be dual feasible. For our application 
there is no objective function to optimise, and so this extra requirement can be ignored. 
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When we step down from a parent node to a child node, we "inherit" the feasible basis from the 
parent and use this as the initial basis for the child. This may no longer be feasible because b might 
have changed; if not then once again we pivot until the basis can be made feasible or we can prove that 
no feasible basis exists. 

Recall that we enlarge the index set I when we step to a child node, which means that we introduce 
new constraints of the form Xj — 0. Each constraint Xj = is enforced as follows: 

• If j is not in the basis then we simply eliminate the variable Xj from the system ( effectively 
deleting the corresponding column from M ). 

• If j is in the basis then we pivot to remove it and eliminate Xj as before. If we cannot pivot j 
out of the basis then some row of the product M7 M must be zero everywhere except for column 

j, which means the equation M^Mx = Mg h implies Xj — z for some scalar z. //z ^ then 
it is impossible to enforce Xj — 0. If z = then Xj = is enforced automatically, and there is 
nothing we need to do. 

The child node is deemed to pass the feasibility test if and only if we are able to obtain a feasible 
basis after enforcing all of our new constraints. 

The procedures described in Algorithm 13. 161 are all standard elements of the dual simplex method, 
and once more we refer the reader to a standard text [5] for details and proofs. There are typically 
many choices available for which pivots to perform; one must be careful to choose a pivoting rule 
that avoids cycling, since the solution cone =S V is highly degenerate. We use Bland's rule [7] in our 
implementation. 

4 Arithmetical bounds 

The tree traversal algorithm involves a significant amount of rational arithmetic — most notably, in the 
feasibility test (Section l3.4l) and the reconstruction of vertices of £2 from their type vectors fLemma l3.2p . 
Moreover, the rationals that we work with can have exponentially large numerators and denominators, 
which means that in general we must use arbitrary precision arithmetic (as provided for instance by 
the GMP library [21]). 

The aim of this section is to derive explicit bounds on the rational numbers that appear in our 
calculations. In particular, Theorem 14.41 bounds the elements of the matrix MJ 1 that we manipulate 
in the feasibility test, and Corollaries 14.51 and 14.61 bound the coordinates of the final vertices of J?. 

These results are useful because the relevant bounds can be precomputed at the beginning of the 
algorithm, and for many reasonable-sized problems they allow us to work in native machine integer 
types (such as 32- bit, 64-bit or 128-bit integers). These native types are significantly faster than arbi- 
trary precision arithmetic: experimentation with the tree traversal algorithm shows a speed increase 
of over 10 times. 

Furthermore, Corollary 14.61 is a significant improvement on the best known bounds for the coordi- 
nates of vertices of =S. These have been independently studied by Hass, Lagarias and Pippenger [24] 
and subsequently by Burton, Jaco, Letscher and Rubinstein; we recount these earlier results before 
presenting Corollary 14.61 below. 

Central to all of the results in this section is the bounding constant 5{M). This is a function of the 
reduced matching equation matrix M, and to obtain the tightest possible bounds it can be computed 
at runtime when the tree traversal algorithm begins. If global quantities are required then Lemma 14.21 
and Corollary 14. 31 bound S(M) in terms of n alone. 



17 



Definition 4.1 (Bounding constant). The bounding constant S(M) is defined as follows. For each 
column c = 1, . . . , 3n, compute the Euclidean length of the cth column of Af ; that is, (^2 i Mf c )i. We 
define S(M) to be the product of the k largest of these lengths, where k = rank Af. 



For example, consider again the matrix 

M = 



1-1 1-1-1 
10 2-102 



which has rank k = 2 and column lengths 1,1, a/5, v2, 1 and a/5. Here the two largest column lengths 
are both a/5, and so S(M) — a/5 ■ a/5 = 5. 

Lemma 4.2. For an arbitrary 3-manifold triangulation, 5(M) < (\/T0) lankA/ < (a/10) 3 ". For an 
orientable triangulation, we can improve this bound to 5(M) < (y/E) TaakM < (a/6) 3 ™. 



Proof. Using Lemma 12.21 and the fact that each column of M has gcd = 1 , there are only a few 
possibilities for the non-zero entries in a column of M. In the orientable case the only options are 
(±1,±1,±1,±1), (±1,±1,±1), (±1,±1), (±1), (±2,±1,±1) and (±2,±1). For non-orient able trian- 
gulations we have the additional possibility of (±3, ±1). 

It follows that each column has Euclidean length < a/6 in the orientable case and < a/10 otherwise, 
yielding bounds of S(M) < (\/6) rankM and S(M) < (A/lO) rankM respectively. To finish we note that 
Af has < 3n column^] and so rankAf < 3n. □ 

A result of Tillmann |38] shows that rank Af = n for any closed 3-manifold triangulation, which 
allows us to tighten our bounds further: 



Corollary 4.3. For an arbitrary closed 3-manifold triangulation we have S(A1) < (a/10)", and for a 
closed orientable 3-manifold triangulation we have S(M) < (a/6)™. 

Remark. In fact these bounds are general: both S(M) < (a/6)" for the orientable case and S(M) < 
(a/10)" otherwise hold for triangulations with boundary also. The argument relies on a proof that 
rankAf = e — v < n, where e and v are the number of edges and vertices of the triangulation that 
do not lie within the boundary. This proof involves topological techniques beyond the scope of this 
paper; see [11] for the full details. 

We can now use 5(M) to bound the rational numbers that appear as we perform the feasibility 
test (Algorithms 13. 15l and [3~T5]) . Our main tool is Hadamard's inequality, which states that if A is any 
square matrix, | detA| is bounded above by the product of the Euclidean lengths of the columns of A. 

Theorem 4.4. Let j3 C {l,...,3n} be any basis as described in Section \3.4\ and let Ma 1 be the 

corresponding k x k inverse matrix, where k = rankAf. Then M7 can be expressed in the form 
Mp 1 = N/A, wh ere A is an integer with |A| < i5(Af), and where N is an integer matrix with 
\ n i,j\ ^ S(M) for every i,j. 



5 When we begin the tree traversal algorithm the matrix M has precisely 3n columns, but recall from Section l3.4l that 
we might delete columns from M as we move through the type tree. 
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Before proving this result, it is important to note that it holds for any basis — not just initial bases 
or feasible bases, but every basis that we pivot through along the way. This means that we can use 
Thcorcm l4.4l to place an upper bound on every integer that appears in every intermediate calculation, 
enabling us to use native machine integer types instead of arbitrary precision arithmetic when 5(M) 
is of a reasonable size. 

For example, consider the Weber-Seifert space, which is mentioned in the introduction and has been 
considered one of the benchmarks for progress^ in computational topology. Using the 23-tetrahedron 
triangulation described in [H] we find that 5(M) = 2 23 , which allows us to store the matrix Mp 1 using 
native (and very fast) 32-bit integers^ 

Proof of Theorem \4-4\ Using the adjoint formula for a matrix inverse we have M^ 1 = N/A, where 

N = adj Mp and A = det Mp. Because M is an integer matrix it is clear that A and all of the entries 
riij are integers. 

By Hadamard's inequality, | A| is bounded above by the product of the Euclidean lengths of the k 
columns of Mp, and it follows from Definition 14. II that |A| < 8(M). Since each adjoint entry m t j is a 
subdeterminant of Mp, a similar argument shows that \n^j\ < 5(M) for each □ 

To finish this section, we use Theorem l4.4l to bound the coordinates of any admissible vertex v G £}. 
More precisely, we bound the coordinates of u = Av, where u G i? v is the smallest integer multiple 
of v. This smallest integer multiple has a precise meaning for topologists (essentially, each integer Ui 
corresponds to a collection of Ui quadrilaterals embedded in some tetrahedron of the triangulation; see 
[91 |39j for full details). 

The first such bound was due to Hass, Lagarias and Pippenger [21], who proved that \ui\ < 128 n /2. 
Their result applies only to true simplicial complexes (not the more general triangulations that we allow 
here), and under their assumptions the matching equation matrix M is much simpler (containing only 
and ±1 entries). 

The only other bound known to date appears in unpublished notes by Burton, Jaco, Letscher and 
Rubinstein (c. 2001 and referenced in [IS]), where it is shown that \ui\ < (v 7 ^)" for a one-vertex closed 
orientable triangulation (this time generalised triangulations are allowed) Q 

In Corollary 14.51 we obtain new bounds on \ui\ in terms of the bounding constant 5{M), and in 
Corollary 14.61 we use this to bound \ui\ in terms of n alone, yielding significant improvements to both 
the Hass et al. and Burton et al. bounds outlined above. 

Corollary 4.5. Let u be the smallest integer multiple of an admissible vertex v £ £}. Then each 
coordinate Ui is bounded as follows: 

• \ui\ < (ink + 2) • 5(M) if the triangulation is orientable; 

• Wi\ < (36nfc + 12) • S(M) otherwise, 
where k = rankAf = rank M. 

6 Of course we must be careful: for instance, row operations of the form x <— Ax + /iy must be performed using 64- bit 
arithmetic, even though the inputs and outputs are guaranteed to fit into 32-bit integers. 

7 Both of these previously- known bounds were derived in standard coordinates (BS 7n ), not quadrilateral coordinates 
(1R 3 "). However, it is simple to convert between coordinate systems [9], and it can be shown that the upper bounds 
differ by a factor of at most An. 
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Proof. This proof is a simple (though slightly messy) matter of starting with Theorem l4 . 41 and following 
back through the various transformations and changes of variable until we reach our solution v £ £? 
to the original matching equations. 

Let t be the type vector of the vertex v, and let x £ K 3 " be the corresponding solution to the 
feasibility test as found by the dual simplex method in Algorithm 13.151 (so that x > and Mx = b) . 
Let (3 be the corresponding feasible basis found by the dual simplex method, so that every non-zero 
entry in x appears as an entry of the vector M7 b. 

From Lemma T3. 141 we recall that b is the negative sum of at most n columns of M. Following the 
same argument used in the proof of Lemma 14.21 each non-zero entry in M is either ±1, ±2 or ±3, 
and for an orientable triangulation just ±1 or ±2. It follows that each \bi\ < 2n if our triangulation is 
orientable, and each \bi\ < 3n otherwise. 

We return now to the vector x. As in Theorem 14.41 let M^ 1 = A/A, where A is an integer 
with |A| < S(M), and where each is an integer with \riij\ < S(M). Every non-zero element of x 
appears in the vector M^" 1 b, and is therefore of the form n i.jbj/ Combining all of the bounds 

above, we deduce that x = x'/A, where x' is an integer vector and where each \x[\ < 2nk5(M) if our 
triangulation is orientable, or \x[\ < 3nkS(M) otherwise. 

Our next task is to step backwards through Lemma l3.141 Our current vector x satisfies the second 
statement of Lemma [3.141 let y be the corresponding solution for the first statement, so that y > 0, 
My — 0, and y satisfies the type constraints for r. As in the proof of Lemma [3. 141 each t/j = Xi or 
Xi + 1. It follows that y = y'/A, where y' is an integer vector and where each < 2nkS(M) + A if 
our triangulation is orientable, or < 3nk5(M) + A otherwise. 

Finally we step backwards through Lemma [3.131 Our vector y satisfies the second statement of 
Lemma [3. 131 let z be the corresponding solution for the first statement, so that z > 0, Mz = 0, and z 
satisfies the type constraints for r. Following the proof of Lemma 13.131 each Zi — Dyi/di, where di is 
the gcd of the zth column of M, and where D = lcm(di, . . . , d^ n ). By Lemma [2.21 we have D < 2 for 
an orientable triangulation and D < 12 otherwise. Therefore z = z'/A, where z' is an integer vector 
and where each |z,-| < 4nkS(M) + 2A if our triangulation is orientable, or \z[\ < 36nkS(M) + 12A 
otherwise. 

To conclude, we observe that our vertex v is a multiple of z, and so the entries in the smallest 
integer multiple u are bounded by the entries in the (possibly larger) integer multiple z'. In particular, 
\m\ < Ank8{M) + 2A < {Ank + 2) • S(M) if our triangulation is orientable, or \u t \ < 36nkS(M) + 12 A < 
(36nfc + 12) • 5(M) otherwise. □ 

We can recast these bounds purely in terms of n: for S(M) we use Lemma 14.21 and for the rank 
k we use Tillmann's theorem that rank A/ = n for closed triangulations [38], or the observation that 
rankM < 3n for bounded triangulations. The result is the following. 

Corollary 4.6. Let u be the smallest integer multiple of an admissible vertex of =S. Then each 
coordinate Ui is bounded as follows: 

• < (An 2 + 2) ■ (v^)" if the triangulation is closed and orientable; 

• < (36n 2 + 12) • (-\/l0) n if the triangulation is closed and non- orientable; 

• < (12n 2 + 2) • (Vfi) 3 " if the triangulation is bounded and orientable; 

• \ u i\ < (108n 2 + 12) ■ (VTO) 3 ' 1 if the triangulation is bounded and non- orientable. 
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Remark. These bounds are significant improvements over previous results. For bounded triangula- 
tions, we improve the Hass et al. bound of 0(128") to O[n 2 (vT0) 3 "] ~ 0(n 2 31.6"), whilst removing 
the requirement for a true simplicial complex. For closed orientable triangulations, we improve the 
Burton et al. bound of 0[(VS) n ) to 0[n 2 (VE) n ], whilst removing the requirement for a one-vertex 
triangulation. 

As with Corollary I4.3[ the stronger bounds that we obtain for closed triangulations can in fact be 
shown to hold for bounded triangulations also. Again this relies on a proof that rankM = e — v < n, 
which calls on techniques beyond the scope of this paper. See [TT] for details. 

5 Time and space complexity 

In this section we derive theoretical bounds on the time and space complexity of normal surface 
enumeration algorithms. We compare these bounds for the double description method (the prior state 
of the art, as described in |12j ) and the tree traversal algorithm that we introduce in this paper. 

All of these bounds are driven by exponential factors, and it is these exponential factors that we 
focus on here. We replace any polynomial factors with a generic polynomial <p(n)', for instance, a 
complexity bound of 0(4™n 3 ) would simply be reported as 0(4 n (p(n)) here. For both algorithms it is 
easy to show that these polynomial factors are of small degree. 

We assume that all arithmetical operations can be performed in polynomial time. For the tree 
traversal algorithm this is a simple consequence of Theorem l4.41 and for the double description method 
it can likewise be shown that all integers have 0{n) bits. We omit the details here. 

Theorem 15.11 analyses the double description method, and Theorem 15.21 studies the tree traversal 
algorithm. In summary, the double description method can be performed in 0(l6 n tp(n)) time and 
0(4 n ip(n)) space, whereas the tree traversal algorithm can be carried out in 0(4 n \V\(p(n)) time and 
0(\V\if(n)) space, where |V| is the output size. This is a significant reduction, given that the output 
size for normal surface enumeration is often extremely small in practice^ If we reformulate these 
bounds in terms of n alone, the tree traversal algorithm requires 0(7 n ip(n)) time, and approximately 
O(3.303 n ip(n)) space for closed triangulations or 0(4 n tp(n)) space for bounded triangulations. 

We do not give bounds in terms of |V| for the double description method, because it suffers from 
a well-known combinatorial explosion: even when the output size is small, the intermediate structures 
that appear can be significantly (and exponentially) more complex. See [2] for examples of this in 
theory, or [9] for experiments that show this in practice for normal surface enumeration. The result 
is that, even for bounded triangulations where both algorithms have space complexities of 0(4 n (p(n)), 
the tree traversal algorithm can effectively exploit a small output size whereas the double description 
method cannot. 

In conclusion, both the time and space bounds for tree traversal are notably smaller, particularly 
when the output size is small. However, it is important to bear in mind that for both algorithms these 
bounds may still be far above the "real" time and memory usage that we observe in practice. For this 
reason it is important to carry out real practical experiments, which we do in Section [S] 

As a final note, for this theoretical analysis we always assume implementations that give the best 
possible complexity bounds. For the tree traversal algorithm we assume interior point methods for the 
feasibility test. For the double description method we assume that adjacency of vertices is tested using 
a polynomial-time algebraic rank test, rather than a simpler combinatorial test that has worst-case 

8 Early indications of this appear in |10| (which works in the larger space R 7n ), and a detailed study will appear in 
To illustrate how extreme these results are in R 3n : across all 139103 032 closed 3- manifold triangulations of size 
n = 9, the maximum output size is just 37, and the mean output size is a mere 9.7. 
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exponential time but excellent performance in practice |121 120) . For the experiments of Section [5] we 
revert to the dual simplex method and combinatorial test respectively, both of which are known to 
perform extremely well in practical settings [T3] [5D1 1331 131] • 

Theorem 5.1. The double description method for normal surface enumeration can be implemented in 
worst-case 0(16 n <p(n)) time and 0(4 n (p(n)) space. 

Proof. As outlined in Section [2] the double description method constructs a series of polytopes 
Po, . . . ,P e C R 3 ™ , where each Pi is the intersection of the unit simplex with the first i matching 
equations. There are e + l€ 0{n) such polytopes in total, and with an algebraic adjacency test we can 
construct each Pi in 0(^?_ 1 </3(n)) time and 0(vi-i(p(n)) space, where Vi-\ is the number of vertices of 
Pi-l that satisfy the quadrilateral constraints [T2"] . 

The main task then is to estimate each Vi. In every polytope Pi, each vertex is uniquely determined 
by which of its coordinates are zero [12] . There are 4 n possible choices of zero coordinates that satisfy 
the quadrilateral constraints, and so each Vi < 4™. This gives a time complexity of 0(16 n ip(n)) and 
space complexity of 0(A n ip(n)) for the algorithm overall. □ 

It should be noted that there are other ways of estimating the vertex counts Vi. For the final 
polytope, Burton [T3] proves that v e £ 0(3.303") for a closed triangulation; however, this result does 
not hold for the intermediate vertex counts v\, . . . , u e -i- McMullen's theorem [31] gives a tight bound 
on the number of vertices of an arbitrary polytope, but here it only yields Vi £ 0(4.24™); the reason it 
exceeds 4™ is that it does not take into account the quadrilateral constraints. 

Theorem 5.2. The tree traversal algorithm for normal surface enumeration can be implemented in 
worst-case 0(4: n \V\ip(n)) time and 0(\V\ip(n)) space, where \V\ is the number of admissible vertices 
of £2. In terms of n alone, the worst-case time complexity is 0(7 n tp(n)), and the worst-case space 



complexity is O i 
triangulations. 



3+VT3 
2 



(p(n) ) ~ O(3.303 n ip(n)) for closed triangulations or 0(A n <p(n)) for bounded 



Proof. The space complexity is straightforward. We do not need to explicitly construct the type tree in 
memory in order to traverse it, and so the only non-polynomial space requirements come from storing 
the solution set V. It follows that the tree traversal algorithm has space complexity 0(\V\<p(n)). By the 
same argument as before we have \V\ < 4™ for arbitrary triangulations, and for closed triangulations 



a result of Burton [13] shows that |V| G O 

The time complexity is more interesting. There are 0(4") nodes in the type tree, and the only non- 
polynomial operation that we perform on each node is the domination test. As described in Section r3.3[ 
for a node labelled with the partial type vector r the domination test takes 0(min {n\V\, n2 k }) time, 
where k is the number of times the symbols 1, 2 or 3 appear in r. It follows that the tree traversal 
algorithm runs in 0(4 n \V\(p(n)) time. 

To obtain a complexity bound that does not involve \V\, a simple counting argument shows that 
at most n(^ l )3 fe nodes in the type tree are labelled with a partial type vector r with the property 
described above. Therefore the total running time is bounded by 



O 



,fe=0 



<p(n) = O 



using the binomial expansion X^fe=o (fe)" = 0- 
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<p(n) =0(7>(n)) 
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6 Experimental performance 



In this section we put our algorithm into practice. We run it through a test suite of 275 triangulations, 
based on the first 275 entries in the census of knot complements tabulated by Christy and shipped 
with version 1.9 of Snap [15] . 

All 275 of these triangulations are bounded triangulations. We use bounded triangulations because 
they are a stronger "stress test" : in practice it has been found that enumerating normal surfaces for a 
bounded triangulation is often significantly more difficult than for a closed triangulation of comparable 
size j!4) . In part this is because the number of admissible vertices of J2 is typically much larger in the 
bounded case [TT] . 

We begin this section by studying the growth of the number of admissible vertices and the number 
of "dead ends" that we walk through in the type tree, where we discover that the number of dead ends 
is far below the theoretical bound of 0(4"). We follow with a direct comparison of running times for 
the tree traversal algorithm and the double description method, where we find that our new algorithm 
runs slower for smaller problems but significantly faster for larger and more difficult problems. 

Growth of the tree traversal 
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Figure 5: Counting solutions and dead ends for the tree traversal algorithm 

Figure [5] measures the "combinatorial size" of the tree traversal: the crosses in the lower part of 
the graph represent the final number of solutions, and the circles in the upper part of the graph count 
the total number of nodes that we visit (including solutions as well as dead ends). In the theoretical 
analysis of Section [5] we estimate both figures as 0(4 n ), but in practice we find that the real counts 
are significantly smaller. The number of solutions appears to grow at a rate of roughly 1.47", and the 
number of nodes that we visit grows at a rate of roughly 1.95" (though with some variation). The 
corresponding regression lines are marked on the gra ph@ 

9 These are standard linear regressions of the form logi/ = an + /3, where y is the quantity being measured on the 
vertical axis. 
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These small growth rates undoubtedly contribute to the strong performance of the algorithm, which 
we discuss shortly. However, Figure [5] raises another intriguing possibility, which is that the number 
of nodes that we visit might be polynomial in the output size. Indeed, for every triangulation in our 
test suite, the number of nodes that we visit is at most 10|T^| 2 (where \V\ represents the number 
of solutions). If this were true in general, both the time and space complexity of the tree traversal 
algorithm would become worst-case polynomial in the combined input and output size. This would be 
a significant breakthrough for normal surface enumeration. We return to this speculation in Section [7J 

We come now to a direct comparison of running times for the tree traversal algorithm and the double 
description algorithm. Both algorithms are implemented using the topological software package Regina 
|S]- In particular, the double description implementation that we use is already built into Regina] 
this represents the current state of the art for normal surface enumeration, and the details of the 
algorithm have been finely tuned over many years |12j . Both algorithms are implemented in C++, and 
all experiments were run on a single core of a 64-bit 2.3 GHz AMD Opteron 2356 processor. 

Comparison of running times 
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Figure 6: Comparing running times for the old and new algorithms 

Figure [6] directly compares the running times for both algorithms: each point on the graph repre- 
sents a single triangulation from the test suite. Both axes use a logarithmic scale. The solid diagonal 
line indicates where both algorithms have identical running times, and each dotted line represents 
an order of magnitude of difference between them. Based on this graph we can make the following 
observations: 

• Problems that are difficult for one algorithm are difficult for both. That is, there are no points 
in the top-left or bottom-right regions of the graph. This is not always the case for vertex 
enumeration problems in general [2]. 

• For smaller problems, the tree traversal algorithm runs slower (in the worst case, around 10 times 
slower). However, as the problems grow more difficult the tree traversal begins to dominate, and 
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for running times over 100 seconds the tree traversal algorithm is almost always faster (in the 
best case, around 100 times faster). 



Speed-up against output size 
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Figure 7: Measuring the performance improvement as the output size grows 

Figure [7] plots these speed-up factors explicitly: again each point on the graph represents a single 
triangulation, and the vertical axis measures the factor of improvement that we gain from the tree 
traversal algorithmic On the horizontal axis we plot the total number of solutions \V\, which is a 
useful estimate of the difficulty of the problem. Again, both axes use a logarithmic scale. 

Here we see the same behaviour as before, but more visibly: for smaller problems the tree traversal 
algorithm runs slower, and then as the problems grow larger the tree traversal algorithm dominates, 
running significantly faster in general for the most difficult cases. 



7 Discussion 

For topological decision problems that require lengthy normal surface enumerations, we conclude that 
the tree traversal algorithm is the most promising algorithm amongst those known to date. Not only 
does it have small time and space complexities in both theory and practice (though these complexities 
remain exponential), but it also supports parallelisation, progress tracking, incremental output and 
early termination. As described in Section I3.21 both parallelisation and incremental output offer great 
potential for normal surface theory, yet they have not been explored in the literature to date. 

One aspect of the tree traversal algorithm that has not been discussed in this paper is the ordering 
of tetrahedra. By reordering the tetrahedra we can alter which nodes of the type tree pass the feasibility 

10 Specifically, the vertical axis measures the double description running time divided by the tree traversal running 
time. 
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test, and thereby reduce the total number of nodes that we visit. Because this total number of nodes is 
the source of the exponential running time, this simple reordering of tetrahedra could have a great effect 
on the time complexity (much like reordering the matching equations does in the double description 
method [H [12]). Further exploration and experimentation in this area could prove beneficial. 

We finish by recalling the empirical observations of Section |H] that the total number of nodes 
visited — and therefore the overall running time — could be a small polynomial in the combined input 
and output size. The data presented in Section [5] suggest a quadratic relationship, and similar exper- 
imentation on closed triangulations suggests a cubic relationship. Even if we examine triangulations 
with extremely small output sizes (such as layered lens spaces, where the number of admissible vertices 
is linear in n [H]), this small polynomial relationship appears to be preserved (for layered lens spaces 
we visit 0(n 3 ) nodes in total). 

Although the tree traversal algorithm cannot solve the vertex enumeration problem for general 
polytopes in polynomial time, there is much contextual information to be gained from normal surface 
theory and the quadrilateral constraints. Further research into this polynomial time conjecture could 
prove fruitful: a counterexample would be informative, and a proof would be a significant breakthrough 
for the complexity of decision algorithms in low-dimensional topology. 
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